ABSTRACT

We use a sample of about 48,000 SDSS early-type galaxies to show that older galaxies
have smaller half-light radii $R_e$ and larger velocity dispersions $\sigma$ than younger
ones of the same stellar mass $M_{\rm star}$. We use the age-corrected luminosity
$L_r^{\rm corr}$ as a proxy for $M_{\rm star}$ to minimize biases: below
$L_r^{\rm corr} \sim 10^{11} L_\odot$, galaxies with age $\sim 11$ Gyr have $R_e$ smaller
by 40% and $\sigma$ larger by 25%, compared to galaxies that are 4 Gyr younger. The sizes
and velocity dispersions of more luminous galaxies vary by less than 15%, whatever their
age, a challenge for current galaxy formation models. A closer check reveals that the
lowering in the dispersion is caused by older galaxies that show a significant departure
from the $R_e$-$L_r^{\rm corr}$ and $\sigma$-$L_r^{\rm corr}$ relations at high
$L_r^{\rm corr}$. Such features might find an explanation in models where more massive
galaxies undergo more minor mergers than less massive galaxies at late times, thus
causing a break in the homology. In terms of the Fundamental Plane of early-type
galaxies, the data indicate that all galaxies show a significant and similar increase in
the dynamical-to-stellar mass ratio with increasing mass, independent of their age.
However, older galaxies have smaller $M_{\rm dyn}/M_{\rm star}$ ratios than objects which
formed more recently. These findings may suggest that lower mass galaxies and, at fixed
stellar mass, higher redshift galaxies, formed from gas-richer progenitors, thus
underwent more dissipation and contraction.

1 INTRODUCTION

The formation and evolution of galaxies is still hotly debated. The cooling of baryons
in dark matter halos should form compact and dense self-supporting, rotating stellar and
gaseous disks (e.g., Fall & Efstathiou 1980; Navarro & Steinmetz 2000; Governato et al.
2007). Later major mergers between disk galaxies have then been proposed as the main
routes to form elliptical galaxies (e.g., Toomre & Toomre 1972). Several detailed
numerical simulations (e.g., Barnes & Hernquist 1991; Boylan-Kolchin et al. 2009;
Dekel & Cox 2009; Robertson et al. 2006; Burkert et al. 2008; Hopkins et al. 2008) have
shown that many dynamical and photometrical properties of the remnant spheroidal
galaxies can be explained simply in terms of the merging of progenitors having varying
levels of gas-richness. Galaxies which form from gas-rich, dissipative mergers result in
more compact remnants with larger velocity dispersions.

On the other hand, in a pure monolithic model of galaxy formation (e.g., Eggen et al.
1962), stars are formed in a single burst of star formation from gas falling towards the
center, and the evolution is passive thereafter. Although there is clear evidence for a
red and dead population of massive early-type galaxies (see Renzini 2006 for a review),
hierarchical merging could still have played some role at late times. The metallicities
of typical early-type galaxies are well reproduced in models with frequent minor mergers
at moderate redshifts (e.g., Bournaud et al. 2007; Naab & Ostriker 2009) and are not
much affected by later dry mergers (Pipino & Matteucci 2008). The sizes and velocity
dispersions of BCGs in the local Universe are evolving in a manner which suggests
frequent minor dry mergers as recently as 1 Gyr ago (Bernardi 2009). The clustering and
number density of massive galaxies in the Sloan Digital Sky Survey (SDSS), the 2dF-SDSS
LRG and QSO Survey (2SLAQ), the NOAO Deep Wide-Field Survey and in DEEP2 also suggest
that some merging events involving massive galaxies must have occurred since redshift
$z \sim 1$ (e.g., Bundy et al. 2007; White et al. 2007; Wake et al. 2008), but that the
majority of the stellar mass had already been assembled by this time.

In addition, there is now growing evidence that massive galaxies at $z \sim 2$ are much
smaller and denser than their local counterparts of the same stellar mass (e.g.,
Trujillo et al. 2006; van Dokkum et al. 2008; Cimatti et al. 2008; Saracco et al. 2008).
These observations are in line with the idea that high-redshift galaxies formed in a
denser universe, and therefore from baryonic clumps which collapsed in denser, more
gas-rich environments, which, in turn, induced more dissipation, more compact remnants
and higher velocity dispersions. However, galaxies as compact as those observed at high
redshift do not exist in the local universe, raising the question of what process or
processes have acted to increase the sizes of these objects to make them consistent with
the larger sizes we see at late times (e.g., van Dokkum et al. 2008).

In this Letter, we present evidence that the sizes of early-type galaxies are difficult
to reconcile with a pure monolithic model, at least for the most massive galaxies. In
§§ 2 and 3 we describe the data set and present the measurements on which our
conclusions are based. We discuss our results and their implications in § 4. Where
necessary, we set the cosmological parameters $\Omega_m = 0.30$, $\Omega_\Lambda = 0.70$,
and $h = H_0/100~{\rm km\,s^{-1}\,Mpc^{-1}} = 0.7$.
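
As a rough cross-check of the lookback times quoted in this Letter, the sketch below integrates the standard flat $\Lambda$CDM lookback-time integral for the parameters above. The function name, the Simpson-rule integration and the neglect of radiation are illustrative assumptions, not part of the original analysis.

```python
import math

# Cosmological parameters as stated in the text.
OMEGA_M = 0.30
OMEGA_L = 0.70
H = 0.7
# 1/H0 in Gyr for H0 = 100 h km/s/Mpc (977.8 Gyr km/s/Mpc divided by H0).
HUBBLE_TIME_GYR = 977.8 / (100.0 * H)


def lookback_time_gyr(z, n_steps=1000):
    """Lookback time to redshift z in a flat LCDM universe (radiation neglected)."""
    if z <= 0:
        return 0.0

    def integrand(zp):
        e_z = math.sqrt(OMEGA_M * (1.0 + zp) ** 3 + OMEGA_L)
        return 1.0 / ((1.0 + zp) * e_z)

    # Composite Simpson rule needs an even number of intervals.
    n = n_steps + (n_steps % 2)
    dz = z / n
    total = integrand(0.0) + integrand(z)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * dz)
    return HUBBLE_TIME_GYR * total * dz / 3.0
```

For $z = 0.3$ this gives roughly 3.4 Gyr, and for $z = 4$ about 12 Gyr, in line with the lookback time and formation redshift quoted elsewhere in the text.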



2 DATA 

We use the SDSS-based sample of early-type galaxies from Hyde & Bernardi (2008a), who
give a prescription for how to correct the SDSS photometric parameters for known sky
subtraction problems which affect objects with large apparent brightnesses. The sample,
which contains about 48,000 early-type galaxies, is distributed within the redshift
range $0.013 < z < 0.3$, which corresponds to a maximum lookback time of 3.5 Gyr. The
galaxies in the sample have apparent magnitudes between 14.5 and 17.5 (based on
de Vaucouleurs fits to the surface brightness profiles), axis ratios $b/a > 0.6$, and
ages greater than 2 Gyr. To study the bulk of the early-type population of local
galaxies, we remove the Brightest Cluster Galaxies (BCGs) from our sample, as they might
have had unusual formation histories (see Bernardi 2009). This was done by matching the
Hyde & Bernardi (2008a) sample to the maxBCG (Koester et al. 2007) and C4 (Miller et al.
2005) cluster samples (this procedure should remove most of the BCGs, although some
contamination could still be present). Our results do not vary significantly if BCGs are
not excluded. The luminosities we report are estimated by first correcting the absolute
magnitudes for evolution to $z = 0$ (following Hyde & Bernardi 2008a, we add $0.9z$ to
the observed magnitudes), and then setting $\log(L_r/L_\odot) = -0.4(M_r - 4.62)$
(Blanton et al. 2001). Estimated stellar masses and ages for these objects are from
Gallazzi et al. (2005).
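
The luminosity estimate described above amounts to two arithmetic steps. A minimal sketch follows, assuming the $0.9z$ term is applied directly to the r-band absolute magnitude (an additive offset is the same whether applied to the apparent or the absolute magnitude); the function and constant names are hypothetical.

```python
SOLAR_ABS_MAG_R = 4.62  # r-band solar absolute magnitude used in the text
EVOLUTION_SLOPE = 0.9   # magnitudes per unit redshift, added to correct to z = 0


def log_luminosity_r(abs_mag_r, z):
    """Return log10(L_r / L_sun) after correcting the magnitude for evolution to z = 0."""
    evolved_mag = abs_mag_r + EVOLUTION_SLOPE * z
    return -0.4 * (evolved_mag - SOLAR_ABS_MAG_R)
```

For instance, a galaxy with $M_r = -21.0$ at $z = 0.1$ has a corrected magnitude of $-20.91$ and hence $\log(L_r/L_\odot) = 10.212$.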



3 RESULTS 

We divide the sample into bins of different ages, as labeled in the top left panel of
Fig. 1, chosen so as to provide at least 2,000 galaxies in each bin. For each age bin,
we further divide the sample into nine equally-spaced bins in luminosity, and plot only
bins with more than 5 galaxies. We compute the median size, velocity dispersion, and
mass-to-light ratio in each bin of age and luminosity. Uncertainties on these median
quantities are approximated by the square root of the variance divided by the square
root of the number of sources in the bin. Finally, we only consider galaxies older than
5 Gyr in our final sample, in order to minimize contamination from objects for which the
luminosity- and mass-weighted ages might differ substantially.
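
The binning procedure just described can be sketched as follows for a single age bin. This is an illustration under stated assumptions, not the authors' code: the bins are taken to be equally spaced in log-luminosity, the sample variance is used (the text does not say which estimator was adopted), and all names are hypothetical.

```python
import math
import statistics


def binned_medians(log_lum, values, n_bins=9, min_count=6):
    """Median of `values` in equally spaced bins of `log_lum`.

    Returns a list of (bin_center, median, error) tuples, where the error is
    sqrt(variance) / sqrt(N). Bins with fewer than `min_count` galaxies are
    skipped (the text plots only bins with more than 5 galaxies).
    """
    lo, hi = min(log_lum), max(log_lum)
    if hi == lo:
        raise ValueError("log_lum must span a non-zero range")
    width = (hi - lo) / n_bins
    bins = [[] for _ in range(n_bins)]
    for x, v in zip(log_lum, values):
        # The maximum value is assigned to the last bin.
        idx = min(int((x - lo) / width), n_bins - 1)
        bins[idx].append(v)

    results = []
    for i, members in enumerate(bins):
        if len(members) < min_count:
            continue
        center = lo + (i + 0.5) * width
        median = statistics.median(members)
        # Sample variance is an assumption; see the note above.
        error = math.sqrt(statistics.variance(members)) / math.sqrt(len(members))
        results.append((center, median, error))
    return results
```

Calling the function once per age bin, with `values` set in turn to $\log R_e$ and $\log\sigma$, would give curves of the kind plotted in Fig. 1.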

The top panels in Figure 1 show the $R_e$-$L_r$ and $\sigma$-$L_r$ relations for galaxies
of different ages. For $L_r < 10^{11} L_\odot$, lines of constant age run parallel to the
$\sigma$-$L_r$ relation, with older galaxies being offset to larger $\sigma$. This is
remarkably consistent with previous expectations which were based on a very different
analysis (Forbes & Ponman 1999; Bernardi et al. 2005). Even more remarkable is the fact
that the $R_e$-$L_r$ relation is almost independent of age. The age estimate is noisy, so
one might have worried that this has erased any age-dependence in the $R_e$-$L_r$
relation. However, this is unlikely to be the case, since the $\sigma$-$L_r$ relation
does vary for galaxies of different age.

Because the luminosity changes as the stellar population ages (typically as
$L_r \propto t^{-0.75}$), we would have liked to replace $L_r$ with the stellar mass
$M_{\rm star}$, and study the $R_e$-$M_{\rm star}$ and $\sigma$-$M_{\rm star}$ relations
instead. However, such plots are complicated by the fact that the SDSS is magnitude
limited and because the age and $M_{\rm star}$ estimates are highly correlated. The flux
limit will tend to select brighter sources at fixed $M_{\rm star}$, thus scattering
sources to lower $M_{\rm star}/L$ ratios and to younger ages with respect to their true
ones, given the intrinsic correlations in the errors. Therefore, the oldest objects
corresponding to a given $M_{\rm star}$ may be missing. However, the correct spread in
ages at fixed $L$ is reproduced if luminosities are taken as independent variables (we
refer the reader to the full analysis presented in Appendix A of Bernardi 2009 for more
discussion). Therefore, we correct each luminosity for its fading with age by setting

$$\log L_r^{\rm corr} = \log L_r + 0.75\,\log(t/5.5~{\rm Gyr}), \qquad (1)$$

where $t$ is the age. Bernardi (2009) shows that $L_r^{\rm corr}$ defined in this way is
a good proxy for $M_{\rm star}$.
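
Eq. (1) is straightforward to apply per galaxy. A minimal sketch, with hypothetical function and constant names, assuming base-10 logarithms and ages in Gyr:

```python
import math

REFERENCE_AGE_GYR = 5.5  # normalization age in Eq. (1)
FADING_EXPONENT = 0.75   # from L_r proportional to t^-0.75


def log_corrected_luminosity(log_lum_r, age_gyr):
    """Eq. (1): log L_r^corr = log L_r + 0.75 log10(t / 5.5 Gyr)."""
    if age_gyr <= 0:
        raise ValueError("age must be positive")
    return log_lum_r + FADING_EXPONENT * math.log10(age_gyr / REFERENCE_AGE_GYR)
```

A galaxy of age 5.5 Gyr is left unchanged, while an 11 Gyr old galaxy is shifted up by $0.75\log 2 \approx 0.23$ dex, compensating for the fading of its older stellar population.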

Fig. 1c shows the $R_e$-$L_r^{\rm corr}$ relation for galaxies of different ages. To
produce Figs. 1c and 1d we first rebin the whole sample in the new grid of luminosities
$L_r^{\rm corr}$ and then recompute the correlations, although we also note that a simple
median rescaling of the curves in Figs. 1a and 1b via Eq. (1) would yield similar
results. In contrast to the $R_e$-$L_r$ relation shown in Fig. 1a, there is now a clear
trend with age: at fixed $L_r^{\rm corr} \propto M_{\rm star}$, younger galaxies tend to
have larger sizes. At $\log(L_r^{\rm corr}/L_\odot) \sim 10.5$, the offset is
$\sim 0.15$ dex; it decreases to $< 0.1$ dex at higher $L_r^{\rm corr}$. Fig. 1d shows
instead that the $\sigma$-$L_r^{\rm corr}$ relation is less age dependent than the
$\sigma$-$L_r$ relation in Fig. 1b. At $\log(L_r^{\rm corr}/L_\odot) \sim 10.5$, the
spread in velocity dispersions is $< 0.1$ dex, and it decreases to $< 0.05$ dex at larger
$L_r^{\rm corr}$. Above $\log(L_r^{\rm corr}/L_\odot) \sim 11$, the relation is curved:
older galaxies have a lower $\sigma$ than expected from extrapolating the
$\sigma$-$L_r^{\rm corr}$ relation defined at lower luminosities to higher
$L_r^{\rm corr}$. More specifically, we find that a fit to the sample of young galaxies
with age $t < 6$ Gyr yields a slope in the $R_e$-$L_r^{\rm corr}$ relation of
$\sim 0.52$ (long-dashed line in Fig. 1c), very close to the slope of 0.56 derived by
Shen et al. (2003) fitting the $R_e$-$L_r^{\rm corr}$ relation to the whole sample of
SDSS early-type galaxies (dotted line). (Both the dashed and dotted lines in Figs. 1c
and 1d are displaced in normalization to match the locus defined by older galaxies.)
Fig. 1c shows that very old ($t > 9$ Gyr) galaxies tend to follow an
$R_e$-$L_r^{\rm corr}$ relation with a very similar slope, although more massive
($\log(L_r^{\rm corr}/L_\odot) > 11$) systems show a systematic and significant
deviation, of up to $\Delta\log R_e \sim 0.1$ dex, from the extrapolation of such a
straight line. Analogously, Fig. 1d shows that the same subsample of old and massive
galaxies also shows a significant departure of $\Delta\log\sigma \sim 0.05$ dex at
$\log(L_r^{\rm corr}/L_\odot) > 11$ from the straight, long-dashed line of slope
$\sim 0.29$, derived from fitting the $\sigma$-$L_r^{\rm corr}$ relation calibrated on
younger galaxies only. Overall, this curvature is similar to that found for BCGs, for
which it is even more evident (Bernardi 2009). We stress here that this gradual
steepening and corresponding flattening in the relations are present in galaxies of a
fixed age, thus mirroring a clear break in the homology when moving from lower to more
massive systems.

Figure 1. Top: size $R_e$ (left) and velocity dispersion $\sigma$ (right) as a function
of r-band luminosity $L_r$, for galaxies of different ages, as labeled. Bottom: Same as
top panels, but after correcting luminosity for age effects, so that it is a proxy for
stellar mass. The long-dashed lines in Figs. 1c and 1d are the fits to the
$R_e$-$L_r^{\rm corr}$ and $\sigma$-$L_r^{\rm corr}$ correlations for the subsample of
galaxies with age $t < 6$ Gyr, displaced in normalization to match the locus defined by
older galaxies. The dotted line in Fig. 1c, with arbitrary normalization, has a slope of
0.56 as derived by Shen et al. (2003) fitting the $R_e$-$L_r^{\rm corr}$ relation to the
whole sample of SDSS early-type galaxies.



4 DISCUSSION AND CONCLUSIONS 

In the simplest galaxy evolution models, the age of the stellar population reflects the
time of assembly of the galaxy. Hence, older galaxies are expected to have smaller sizes
and larger velocity dispersions than their younger counterparts of the same
$M_{\rm star}$, both because the high-redshift Universe was denser, and because the
objects at that time are thought to have formed from gas-richer progenitors. However,
the differences observed between the sizes of old and young galaxies in our sample are
far smaller than expected given the evolution $R_e \propto (1+z)^{-1}$ at fixed stellar
mass which would result if the galaxy density is proportional to the density of the
universe. For example, galaxies as old as 12 Gyr ($z \sim 4$) should be displaced by
$\Delta\log R_e \sim 0.5$ (i.e., a factor of $\sim 3$) downwards with respect to the
younger galaxies in our sample. So the absence, in our data, of a strong age-dependent
trend in size rules these models out.

A more elaborate model postulates that although the sizes were initially smaller (so as
to be consistent with the $z \sim 2$ observations mentioned in the Introduction), they
have since evolved, while the stellar mass has remained unchanged (Fan et al. 2008).
This model exploits the fact that the epoch when early-type galaxies were forming stars
is close to that when AGNs were most active (e.g., Cattaneo & Bernardi 2003; Granato et
al. 2004; Haiman et al. 2007; Shankar et al. 2008, 2009a). So, in this model, AGN
activity is assumed to expel gas from the central regions; the sudden reduction of mass
in the core makes the surrounding stellar distribution puff up, increasing the size.
Because the objects are assumed to eventually settle back into virial equilibrium at
these larger sizes, with no change in mass, this model also predicts that the velocity
dispersions decrease from their initial values, but that the age-dependence in the
$\sigma$-$M_{\rm star}$ relation is not erased.

The Fan et al. (2008) model was calibrated to reconcile the differences between the
$z \sim 2$ and local $R_e$-$M_{\rm star}$ relations. At lower $M_{\rm star}$, it predicts
that younger galaxies should be larger by $\sim 0.15$-$0.2$ dex, in good agreement with
our Fig. 1c. At higher masses, we find an offset of $< 0.1$ dex, which is less than the
$\sim 0.3$ dex they predict. They also predict that $\sigma$ should be larger for older
galaxies: they find an increase of $\sim 0.15$ dex in $\sigma$ when moving from younger
to older galaxies at large masses, with slightly smaller trends at lower masses. While
we are in qualitative agreement with their predictions, our results point to a much
smaller offset, especially at $\log(L_r^{\rm corr}/L_\odot) > 11$. However, given the
systematic uncertainties in computing the profile and luminosity-dependent normalization
coefficients in the virial relations (see Fan et al. 2008 for details), it is difficult
to make detailed comparisons with their predictions. Only ad-hoc numerical simulations
will be able to further probe their model.

In hierarchical models, the stellar population can be older than the time at which the
total mass was assembled into one system (e.g., Bower et al. 2006; De Lucia et al. 2006).
This is accomplished by making dense progenitors through wet (gas rich) mergers at high
redshift, followed by a sequence of dry, dissipationless mergers which serve to reduce
the densities (e.g., Kormendy et al. 2008). However, this evolution in sizes cannot be
directly tested from Fig. 1c, which shows galaxies only in their present form. On the
other hand, the observed break in the $\sigma$-$L_r^{\rm corr}$ relation (Fig. 1d), when
combined with the steepening in the $R_e$-$L_r^{\rm corr}$ relation (Fig. 1c), might be a
signature that minor mergers played a role in the mass assembly of at least the most
massive objects.

Figure 2. Dynamical mass-to-light ratio as a function of age-corrected luminosity for
galaxies of different ages, as labeled. Irrespective of age, we find that all galaxies
show a significant tilt.

Preliminary results from Shankar et al. (2009b, and references therein) show that major
dry mergers are uncommon for intermediate mass galaxies, with even the most massive
galaxies experiencing one such event at most. Early-type galaxies, born at $z \sim 2$
and with $z = 0$ mass $M_{\rm star} > 3 \times 10^{10} M_\odot$, undergo at least 3-7
minor dry mergers, with the number increasing with $M_{\rm star}$. As sketched by
Bernardi (2009), and theoretically discussed by, e.g., Ciotti (2008, and references
therein), repeated minor mergers of mass ratio $f < 1$ can enable the remnants to
increase their masses by a factor $(1 + f)$ and their sizes by a factor of $(1 + 2f)$,
and to decrease $\sigma^2$ by a factor of $(1 - f)$, thus without changing the virial
product $\sigma^2 R_e$ much. For example, even 5 minor mergers with $f$ as low as
$f \approx 0.2$ would be capable of increasing the sizes by a factor of $\sim 5$, and the
stellar mass by just a factor of 2.5. Recent ad-hoc numerical simulations are showing
that this can actually be possible (e.g., Naab et al. 2009, and references therein).
This steep and fast evolution in the $R_e$-$M_{\rm star}$ plane is another way to
efficiently puff up the high-$z$, compact galaxies. The main challenge for hierarchical
models would then be to grow all galaxies of different ages coherently on a similar
size-luminosity relation, as we see it today (see Shankar et al. 2009b). Also,
hierarchical models tend to produce too large size dispersions at fixed galaxy
luminosity, at variance with our data (Gonzalez et al. 2008, but see also Khochfar &
Silk 2006). Nevertheless, a late evolution driven by minor mergers in the most massive
and older galaxies, which preferentially sit in richer environments, might explain the
gradual steepening of the size-mass relation at high $L_r^{\rm corr}$ and the
corresponding curvature at high $\sigma$.
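
The minor-merger scalings quoted in this paragraph (mass by $1+f$, size by $1+2f$, $\sigma^2$ by $1-f$ per merger) can be compounded numerically to check the worked example of five mergers with $f \approx 0.2$. This is a sketch only: it assumes the same mass ratio $f$, relative to the current mass, at every step, and the function name is hypothetical.

```python
def minor_merger_growth(f, n_mergers):
    """Cumulative growth factors after `n_mergers` mergers of mass ratio f < 1.

    Per merger, following the scalings quoted in the text:
    mass x (1 + f), size x (1 + 2f), sigma^2 x (1 - f).
    """
    if not 0 < f < 1:
        raise ValueError("mass ratio f must satisfy 0 < f < 1")
    mass = (1 + f) ** n_mergers
    size = (1 + 2 * f) ** n_mergers
    sigma_sq = (1 - f) ** n_mergers
    return mass, size, sigma_sq


mass, size, sigma_sq = minor_merger_growth(0.2, 5)
print(f"mass x{mass:.2f}, size x{size:.2f}, sigma^2 x{sigma_sq:.2f}")
# prints: mass x2.49, size x5.38, sigma^2 x0.33
```

The size grows by a factor of about 5.4 while the mass grows by only about 2.5, matching the figures in the text.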

Before concluding, we would like to discuss our findings in the context of the
Fundamental Plane relation $R_e \propto \sigma^a I_r^b$ (FP; e.g., Djorgovski & Davis
1987). Observations suggest that $a \sim 1.43 \pm 0.05$ and $b = -0.79 \pm 0.02$ with
small scatter (e.g., Hyde & Bernardi 2008b). The fact that $(a, b) \neq (2, -1)$ is
sometimes called the 'tilt', and is thought to reflect the fact that $M_{\rm dyn}/L_r$
or $M_{\rm star}/L_r$ is not constant (e.g., D'Onofrio et al. 2006; Hyde & Bernardi
2008b). The idea is that if

$$\sigma^2 \propto \frac{M_{\rm dyn}}{R_e} \propto \frac{M_{\rm dyn}}{L_r}\, I_r R_e \propto L_r^{\gamma}\, I_r R_e, \qquad (2)$$

then

$$R_e \propto \sigma^{2/(2\gamma+1)}\, I_r^{-(\gamma+1)/(2\gamma+1)}, \qquad (3)$$

with $I_r$ the surface brightness of the galaxy, so that $L_r \propto I_r R_e^2$.
Previous work has shown that $\gamma > 0$ if $L_r$ is the optical luminosity. In the
discussion which follows, we consider the effect of replacing $L_r$ with
$L_r^{\rm corr}$. Figure A5 in Bernardi (2009) shows that $M_{\rm star}/L_r^{\rm corr}$
vs $L_r^{\rm corr}$ is flat for these galaxies, so this should be equivalent to studying
the stellar mass Fundamental Plane (e.g., Hyde & Bernardi 2008b).
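
The mapping from the tilt $\gamma$ to the FP coefficients in Eq. (3) can be evaluated directly. A small sketch, with a hypothetical function name; the value $\gamma = 0.25$ in the note below is purely illustrative.

```python
def fp_exponents(gamma):
    """Exponents (a, b) in R_e ~ sigma^a I_r^b implied by Eq. (3)."""
    denom = 2.0 * gamma + 1.0
    if denom == 0:
        raise ValueError("gamma = -1/2 makes the exponents undefined")
    return 2.0 / denom, -(gamma + 1.0) / denom
```

With $\gamma = 0$ the function returns the untilted values $(2, -1)$; any $\gamma > 0$ lowers $a$ below 2 and raises $b$ above $-1$, e.g. $\gamma = 0.25$ gives $(a, b) \approx (1.33, -0.83)$.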

Fig. 2 shows $M_{\rm dyn}/L_r^{\rm corr}$ versus $L_r^{\rm corr}$, for galaxies in
different bins of formation time. (We define the dynamical mass as
$(M_{\rm dyn}/M_\odot) = 10^{10}\,(\sigma/200~{\rm km\,s^{-1}})^2\,(R_e/h^{-1}\,{\rm kpc})$,
and only show bins in which there were more than 100 galaxies.) It is clear that the
relation is not flat: except for the bin with the most recent formation time (which may
be contaminated by selection effects and/or errors in age of the type discussed by
Bernardi 2009), the 'tilt' is $\gamma \sim 0.13$ (long-dashed line in Fig. 2), and it is
approximately independent of age. Under the reasonable assumption of a universal DM
profile (e.g., Navarro et al. 1997), the tilts reported in Fig. 2 suggest that less
massive galaxies are more concentrated, possibly due to more dissipation, thus inducing
more contraction and a lower dark matter fraction within $R_e$. Note that, as discussed
in § 2, all $R_e$ in SDSS are calibrated with de Vaucouleurs fits (i.e., with a Sersic
index $n = 4$), thus the non-homology seen in Fig. 2 should principally derive from
actual dynamical mass variations with stellar mass and not, for example, from
non-homology effects in the light distributions (e.g., Tortora et al. 2009, and
references therein). Moreover, Fig. 2 also shows that, at fixed $L_r^{\rm corr}$,
galaxies which formed earlier tend to have smaller $M_{\rm dyn}/L_r^{\rm corr}$ than
younger ones. This offset is mainly driven by the smaller sizes $R_e$ associated with
older galaxies, as expected if the latter were formed in a denser and gas-richer
environment.
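
For reference, the dynamical-mass definition given in this paragraph and the ratio plotted in Fig. 2 can be written as two short helpers. This is a sketch under the normalization quoted in the text; the function names are hypothetical, and $R_e$ is assumed to be supplied in units of $h^{-1}$ kpc.

```python
import math


def dynamical_mass_msun(sigma_kms, r_e_kpc_h):
    """M_dyn / M_sun = 1e10 * (sigma / 200 km/s)^2 * (R_e / h^-1 kpc)."""
    return 1e10 * (sigma_kms / 200.0) ** 2 * r_e_kpc_h


def log_mdyn_over_lcorr(sigma_kms, r_e_kpc_h, log_lum_corr):
    """log10 of M_dyn / L_r^corr in solar units, given log10(L_r^corr / L_sun)."""
    return math.log10(dynamical_mass_msun(sigma_kms, r_e_kpc_h)) - log_lum_corr
```

At fixed $L_r^{\rm corr}$, halving $R_e$ lowers the ratio by 0.3 dex, whereas raising $\sigma$ by 25% raises it by about 0.19 dex, so the two age trends partly offset each other.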

Finally, note that setting $\gamma = 1/4$ in equation (3) would make the Fundamental
Plane relation $R_e \propto \sigma^{1.33} I_r^{-0.83}$ for populations of a fixed
formation time. However, because of the dependence of the zero-point of the
$M_{\rm dyn}/L_r^{\rm corr}$-$L_r^{\rm corr}$ relation on formation time, the slope
$\gamma$ becomes shallower if one averages over a range of formation times. A smaller
value of the tilt $\gamma$ means that the FP coefficient $a$ should be larger when one
averages over the full early-type population than when one restricts the study to a
small range of formation times. All the galaxies in a cluster tend to have similar
formation times (e.g., Bernardi 2009). This suggests that the FP computed for a single
cluster should have greater 'tilt' (the coefficient $a$ should be further from 2) than
the FP for the full population. So it is interesting that $a \sim 1.6$ for the full
population (Hyde & Bernardi 2008b). Perhaps this is why the traditional FP, with $I_r$
instead of $I_r^{\rm corr}$, has $a \sim 1.43 \pm 0.05$ for SDSS early-types (Bernardi et
al. 2003; Hyde & Bernardi 2008b), whereas $a \sim 1.24 \pm 0.07$ for the Coma cluster is
smaller (e.g., Jorgensen et al. 1996).